{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is TensorRT?\n",
    "\n",
    "TensorRT is an optimization tool provided by NVIDIA that applies graph optimization and layer fusion, and finds the fastest implementation of a deep learning model. In other words, TensorRT will optimize our deep learning model so that we expect a faster inference time than the original model (before optimization), such as 5x faster or 2x faster. The bigger model we have, the bigger space for TensorRT to optimize the model. Furthermore, this TensorRT supports all NVIDIA GPU devices, such as 1080Ti, Titan XP for Desktop, and Jetson TX1, TX2 for embedded device."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Standard workflow for optimizing Tensorflow model to TensorRT\n",
    "\n",
    "![alt text](pictures/tf-trt_workflow.png)\n",
    "\n",
    "## Library I use in this video series\n",
    "Pre-requrement: Install TensorRT by following this tutorial [here](https://medium.com/@ardianumam/installing-tensorrt-in-ubuntu-dekstop-1c7307e1dcf6) for Ubuntu dekstop or [here](https://medium.com/@ardianumam/installing-tensorrt-in-jetson-tx2-8d130c4438f5) for Jetson devices",
    "\n",
    "1. Tensorflow 1.12\n",
    "2. OpenCV 3.4.5\n",
    "3. Pillow 5.2.0\n",
    "4. Numpy 1.15.2\n",
    "5. Matplotlib 3.0.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### (a) Read input: Tensorflow model and (b) Convert to frozen model (*.pb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:Restoring parameters from ./model/tensorflow/small/model_small\n",
      "INFO:tensorflow:Froze 10 variables.\n",
      "INFO:tensorflow:Converted 10 variables to const ops.\n",
      "WARNING:tensorflow:From <ipython-input-1-963db26f5bec>:26: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Use tf.gfile.GFile.\n",
      "Frozen model is successfully stored!\n"
     ]
    }
   ],
   "source": [
    "# import the needed libraries\n",
    "import tensorflow as tf\n",
    "import tensorflow.contrib.tensorrt as trt\n",
    "from tensorflow.python.platform import gfile\n",
    "\n",
    "\n",
    "# has to be use this setting to make a session for TensorRT optimization\n",
    "with tf.Session(config=tf.ConfigProto(gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.50))) as sess:\n",
    "    # import the meta graph of the tensorflow model\n",
    "    #saver = tf.train.import_meta_graph(\"./model/tensorflow/big/model1.meta\")\n",
    "    saver = tf.train.import_meta_graph(\"./model/tensorflow/small/model_small.meta\")\n",
    "    # then, restore the weights to the meta graph\n",
    "    #saver.restore(sess, \"./model/tensorflow/big/model1\")\n",
    "    saver.restore(sess, \"./model/tensorflow/small/model_small\")\n",
    "    \n",
    "    # specify which tensor output you want to obtain \n",
    "    # (correspond to prediction result)\n",
    "    your_outputs = [\"output_tensor/Softmax\"]\n",
    "    \n",
    "    # convert to frozen model\n",
    "    frozen_graph = tf.graph_util.convert_variables_to_constants(\n",
    "        sess, # session\n",
    "        tf.get_default_graph().as_graph_def(),# graph+weight from the session\n",
    "        output_node_names=your_outputs)\n",
    "    #write the TensorRT model to be used later for inference\n",
    "    with gfile.FastGFile(\"./model/frozen_model.pb\", 'wb') as f:\n",
    "        f.write(frozen_graph.SerializeToString())\n",
    "    print(\"Frozen model is successfully stored!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### (c) Optimize the frozen model to TensorRT graph"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:Running against TensorRT version 4.0.1\n",
      "TensorRT model is successfully stored!\n"
     ]
    }
   ],
   "source": [
    "# convert (optimize) frozen model to TensorRT model\n",
    "trt_graph = trt.create_inference_graph(\n",
    "    input_graph_def=frozen_graph,# frozen model\n",
    "    outputs=your_outputs,\n",
    "    max_batch_size=2,# specify your max batch size\n",
    "    max_workspace_size_bytes=2*(10**9),# specify the max workspace\n",
    "    precision_mode=\"FP32\") # precision, can be \"FP32\" (32 floating point precision) or \"FP16\"\n",
    "\n",
    "#write the TensorRT model to be used later for inference\n",
    "with gfile.FastGFile(\"./model/TensorRT_model.pb\", 'wb') as f:\n",
    "    f.write(trt_graph.SerializeToString())\n",
    "print(\"TensorRT model is successfully stored!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### (optional) Count how many nodes/operations before and after optimization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "numb. of all_nodes in frozen graph: 46\n",
      "numb. of trt_engine_nodes in TensorRT graph: 2\n",
      "numb. of all_nodes in TensorRT graph: 17\n"
     ]
    }
   ],
   "source": [
    "# check how many ops of the original frozen model\n",
    "all_nodes = len([1 for n in frozen_graph.node])\n",
    "print(\"numb. of all_nodes in frozen graph:\", all_nodes)\n",
    "\n",
    "# check how many ops that is converted to TensorRT engine\n",
    "trt_engine_nodes = len([1 for n in trt_graph.node if str(n.op) == 'TRTEngineOp'])\n",
    "print(\"numb. of trt_engine_nodes in TensorRT graph:\", trt_engine_nodes)\n",
    "all_nodes = len([1 for n in trt_graph.node])\n",
    "print(\"numb. of all_nodes in TensorRT graph:\", all_nodes)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:TF1120_GPU]",
   "language": "python",
   "name": "conda-env-TF1120_GPU-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
